Overview

Dataset statistics

Number of variables: 21
Number of observations: 17379
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 7.4 MiB
Average record size in memory: 449.0 B

Variable types

NUM: 11
CAT: 7
BOOL: 3

Reproduction

Analysis started: 2020-04-02 08:59:07.238055
Analysis finished: 2020-04-02 08:59:51.622430
Version: pandas-profiling v2.5.0
Command line: pandas_profiling --config_file config.yaml [YOUR_FILE.csv]
Download configuration: config.yaml

Warnings

dteday has a high cardinality: 731 distinct values (High cardinality)
atemp is highly correlated with temp (High Correlation)
temp is highly correlated with atemp (High Correlation)
cnt is highly correlated with registered (High Correlation)
registered is highly correlated with cnt (High Correlation)
season_int is highly correlated with season (High Correlation)
season is highly correlated with season_int (High Correlation)
weathersit_int is highly correlated with weathersit (High Correlation)
weathersit is highly correlated with weathersit_int (High Correlation)
dteday only contains datetime values, but is categorical. Consider applying pd.to_datetime() (Type)
hr has 726 (4.2%) zeros (Zeros)
windspeed has 2180 (12.5%) zeros (Zeros)
casual has 1581 (9.1%) zeros (Zeros)
weekday_int has 2502 (14.4%) zeros (Zeros)
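
The zero percentages in the warnings can be reproduced from the zero counts and the reported number of observations. The following is a minimal sketch using only numbers copied from this report; the helper name `zero_share` is illustrative and not part of pandas-profiling.

```python
# Zero counts copied from the warnings; n_obs is the reported number of observations.
n_obs = 17379
zero_counts = {"hr": 726, "windspeed": 2180, "casual": 1581, "weekday_int": 2502}


def zero_share(count, total):
    """Return the share of zeros as a percentage rounded to one decimal."""
    return round(100 * count / total, 1)


zero_pct = {name: zero_share(count, n_obs) for name, count in zero_counts.items()}
for name, pct in zero_pct.items():
    print(f"{name}: {pct}%")
```

Not every flagged zero is suspicious: a zero in hr is simply the hour 0, and the 2502 zeros in weekday_int equal the count reported for Sunday in weekday.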

Variables

instant
Real number (ℝ≥0)

UNIFORM
UNIQUE
Distinct count: 17379
Unique (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 8690
Minimum: 1
Maximum: 17379
Zeros: 0
Zeros (%): 0.0%
Memory size: 135.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 869.9
Q1: 4345.5
Median: 8690
Q3: 13034.5
95-th percentile: 16510.1
Maximum: 17379
Range: 17378
Interquartile range (IQR): 8689

Descriptive statistics

Standard deviation: 5017.0295
Coefficient of variation (CV): 0.5773336593
Kurtosis: -1.2
Mean: 8690
Mean Absolute Deviation (MAD): 4344.749986
Skewness: 0
Sum: 151023510
Variance: 25170585
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[1.0000e+00 1.7379e+04], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
2047 1 < 0.1%
 
4727 1 < 0.1%
 
12947 1 < 0.1%
 
14994 1 < 0.1%
 
8849 1 < 0.1%
 
10896 1 < 0.1%
 
17037 1 < 0.1%
 
4743 1 < 0.1%
 
6790 1 < 0.1%
 
645 1 < 0.1%
 
Other values (17369) 17369 99.9%
 
ValueCountFrequency (%) 
1 1 < 0.1%
 
2 1 < 0.1%
 
3 1 < 0.1%
 
4 1 < 0.1%
 
5 1 < 0.1%
 
ValueCountFrequency (%) 
17379 1 < 0.1%
 
17378 1 < 0.1%
 
17377 1 < 0.1%
 
17376 1 < 0.1%
 
17375 1 < 0.1%
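
The quantile and descriptive statistics reported for instant can be cross-checked with the standard library. This is a sketch that assumes instant is the row counter 1..17379, which is consistent with its being unique with minimum 1 and maximum 17379. The inclusive (linear interpolation) quantile method is also an assumption, chosen because it reproduces the reported quartiles.

```python
import math
import statistics

# Assumption: instant is the row counter 1..17379 (unique, min 1, max 17379).
instant = list(range(1, 17380))

mean = statistics.mean(instant)
variance = statistics.variance(instant)  # sample variance (n - 1 denominator)
std = math.sqrt(variance)
cv = std / mean                          # coefficient of variation
q1, median, q3 = statistics.quantiles(instant, n=4, method="inclusive")
iqr = q3 - q1
# Mean absolute deviation about the mean.
mean_abs_dev = sum(abs(x - mean) for x in instant) / len(instant)

print(mean, variance, round(std, 4), round(cv, 10))
print(q1, median, q3, iqr)
print(round(mean_abs_dev, 6))
```

Under these assumptions, the reported MAD of 4344.749986 coincides with the mean absolute deviation about the mean, not with a median-based deviation.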
 

dteday
Categorical

HIGH CARDINALITY
TYPE DATE
UNIFORM
Distinct count: 731
Unique (%): 4.2%
Missing: 0
Missing (%): 0.0%
Memory size: 135.9 KiB
2012-09-21
 
24
2012-12-11
 
24
2012-08-04
 
24
2011-12-10
 
24
2012-11-04
 
24
Other values (726)
17259
ValueCountFrequency (%) 
2012-09-21 24 0.1%
 
2012-12-11 24 0.1%
 
2012-08-04 24 0.1%
 
2011-12-10 24 0.1%
 
2012-11-04 24 0.1%
 
2012-01-09 24 0.1%
 
2012-03-16 24 0.1%
 
2011-12-18 24 0.1%
 
2012-08-09 24 0.1%
 
2012-09-13 24 0.1%
 
Other values (721) 17139 98.6%
 

Length

Max length: 10
Mean length: 10
Min length: 10
ValueCountFrequency (%) 
Decimal_Number 10 90.9%
 
Dash_Punctuation 1 9.1%
 
ValueCountFrequency (%) 
Common 11 100.0%
 
ValueCountFrequency (%) 
ASCII 11 100.0%
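
The distinct count of 731 for dteday matches the number of calendar days between the first and last dates in the sample rows. The sketch below checks this with the standard library. It assumes the data is meant to hold one row per hour, which the report does not state. In pandas, the conversion suggested by the Type warning would be done with pd.to_datetime(); the stdlib call here only illustrates that the strings are plain ISO dates.

```python
from datetime import date

# First and last dates are taken from the sample rows of this report.
first_day = date(2011, 1, 1)
last_day = date(2012, 12, 31)

n_days = (last_day - first_day).days + 1  # inclusive day count
expected_hours = n_days * 24              # assumption: one row per hour of every day
n_obs = 17379                             # reported number of observations
missing_hours = expected_hours - n_obs

# ISO-formatted strings such as "2012-09-21" parse directly.
parsed = date.fromisoformat("2012-09-21")

print(n_days, expected_hours, missing_hours, parsed)
```

Under the one-row-per-hour assumption, the gap between expected_hours and the observation count indicates hours with no record. This agrees with the most frequent dates topping out at 24 rows.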
 

season
Categorical

HIGH CORRELATION
Distinct count: 4
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 135.9 KiB
Summer
4496
Spring
4409
Winter
4242
Fall
4232
ValueCountFrequency (%) 
Summer 4496 25.9%
 
Spring 4409 25.4%
 
Winter 4242 24.4%
 
Fall 4232 24.4%
 

Length

Max length: 6
Mean length: 5.51297543
Min length: 4
ValueCountFrequency (%) 
Lowercase_Letter 11 78.6%
 
Uppercase_Letter 3 21.4%
 
ValueCountFrequency (%) 
Latin 14 100.0%
 
ValueCountFrequency (%) 
ASCII 14 100.0%
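
The mean length reported for season is the count-weighted average of the label lengths. A short sketch using the value counts from the frequency table:

```python
# Value counts copied from the season frequency table.
season_counts = {"Summer": 4496, "Spring": 4409, "Winter": 4242, "Fall": 4232}

total = sum(season_counts.values())
# Count-weighted mean of the label lengths.
mean_length = sum(len(label) * count for label, count in season_counts.items()) / total

print(total, round(mean_length, 8))
```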
 

yr
Boolean

Distinct count: 2
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 135.9 KiB
1
8734
0
8645
ValueCountFrequency (%) 
1 8734 50.3%
 
0 8645 49.7%
 

month
Categorical

Distinct count: 12
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 135.9 KiB
May
 
1488
July
 
1488
December
 
1483
August
 
1475
March
 
1473
Other values (7)
9972
ValueCountFrequency (%) 
May 1488 8.6%
 
July 1488 8.6%
 
December 1483 8.5%
 
August 1475 8.5%
 
March 1473 8.5%
 
October 1451 8.3%
 
June 1440 8.3%
 
September 1437 8.3%
 
November 1437 8.3%
 
April 1437 8.3%
 
Other values (2) 2770 15.9%
 

Length

Max length: 9
Mean length: 6.142873583
Min length: 3
ValueCountFrequency (%) 
Lowercase_Letter 18 69.2%
 
Uppercase_Letter 8 30.8%
 
ValueCountFrequency (%) 
Latin 26 100.0%
 
ValueCountFrequency (%) 
ASCII 26 100.0%
 

hr
Real number (ℝ≥0)

ZEROS
Distinct count: 24
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11.54675183
Minimum: 0
Maximum: 23
Zeros: 726
Zeros (%): 4.2%
Memory size: 135.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 6
Median: 12
Q3: 18
95-th percentile: 22
Maximum: 23
Range: 23
Interquartile range (IQR): 12

Descriptive statistics

Standard deviation: 6.914405095
Coefficient of variation (CV): 0.5988181957
Kurtosis: -1.198020588
Mean: 11.54675183
Mean Absolute Deviation (MAD): 5.988232784
Skewness: -0.01067990952
Sum: 200671
Variance: 47.80899782
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[ 0. 0.5 22.5 23. ], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
16 730 4.2%
 
17 730 4.2%
 
15 729 4.2%
 
13 729 4.2%
 
14 729 4.2%
 
22 728 4.2%
 
18 728 4.2%
 
19 728 4.2%
 
20 728 4.2%
 
21 728 4.2%
 
Other values (14) 10092 58.1%
 
ValueCountFrequency (%) 
0 726 4.2%
 
1 724 4.2%
 
2 715 4.1%
 
3 697 4.0%
 
4 697 4.0%
 
ValueCountFrequency (%) 
23 728 4.2%
 
22 728 4.2%
 
21 728 4.2%
 
20 728 4.2%
 
19 728 4.2%
 

holiday
Boolean

Distinct count: 2
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 135.9 KiB
0
16879
1
 
500
ValueCountFrequency (%) 
0 16879 97.1%
 
1 500 2.9%
 

weekday
Categorical

Distinct count: 7
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 135.9 KiB
Saturday
2512
Sunday
2502
Friday
2487
Monday
2479
Wednesday
2475
Other values (2)
4924
ValueCountFrequency (%) 
Saturday 2512 14.5%
 
Sunday 2502 14.4%
 
Friday 2487 14.3%
 
Monday 2479 14.3%
 
Wednesday 2475 14.2%
 
Thursday 2471 14.2%
 
Tuesday 2453 14.1%
 

Length

Max length: 9
Mean length: 7.14183785
Min length: 6
ValueCountFrequency (%) 
Lowercase_Letter 12 70.6%
 
Uppercase_Letter 5 29.4%
 
ValueCountFrequency (%) 
Latin 17 100.0%
 
ValueCountFrequency (%) 
ASCII 17 100.0%
 

workingday
Boolean

Distinct count: 2
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 135.9 KiB
1
11865
0
5514
ValueCountFrequency (%) 
1 11865 68.3%
 
0 5514 31.7%
 

weathersit
Categorical

HIGH CORRELATION
Distinct count: 4
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 135.9 KiB
Clear
11413
Misty Cloudy
4544
Light Snow
 
1419
Thunderstorm
 
3
ValueCountFrequency (%) 
Clear 11413 65.7%
 
Misty Cloudy 4544 26.1%
 
Light Snow 1419 8.2%
 
Thunderstorm 3 < 0.1%
 

Length

Max length: 12
Mean length: 7.239714598
Min length: 5
ValueCountFrequency (%) 
Lowercase_Letter 16 72.7%
 
Uppercase_Letter 5 22.7%
 
Space_Separator 1 4.5%
 
ValueCountFrequency (%) 
Latin 21 95.5%
 
Common 1 4.5%
 
ValueCountFrequency (%) 
ASCII 22 100.0%
 

temp
Real number (ℝ≥0)

HIGH CORRELATION
Distinct count: 50
Unique (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.4969871684
Minimum: 0.02
Maximum: 1
Zeros: 0
Zeros (%): 0.0%
Memory size: 135.9 KiB

Quantile statistics

Minimum: 0.02
5-th percentile: 0.2
Q1: 0.34
Median: 0.5
Q3: 0.66
95-th percentile: 0.8
Maximum: 1
Range: 0.98
Interquartile range (IQR): 0.32

Descriptive statistics

Standard deviation: 0.1925561212
Coefficient of variation (CV): 0.3874468668
Kurtosis: -0.9418442041
Mean: 0.4969871684
Mean Absolute Deviation (MAD): 0.1651747656
Skewness: -0.006020883348
Sum: 8637.14
Variance: 0.03707785983
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[0.02 0.09 0.13 0.15 0.17 ... 0.83 0.87 0.93 0.97 1. ], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
0.62 726 4.2%
 
0.66 693 4.0%
 
0.64 692 4.0%
 
0.7 690 4.0%
 
0.6 675 3.9%
 
0.36 671 3.9%
 
0.34 645 3.7%
 
0.3 641 3.7%
 
0.4 614 3.5%
 
0.32 611 3.5%
 
Other values (40) 10721 61.7%
 
ValueCountFrequency (%) 
0.02 17 0.1%
 
0.04 16 0.1%
 
0.06 16 0.1%
 
0.08 17 0.1%
 
0.1 51 0.3%
 
ValueCountFrequency (%) 
1 1 < 0.1%
 
0.98 1 < 0.1%
 
0.96 16 0.1%
 
0.94 17 0.1%
 
0.92 49 0.3%
 

atemp
Real number (ℝ≥0)

HIGH CORRELATION
Distinct count: 65
Unique (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.4757751021
Minimum: 0
Maximum: 1
Zeros: 2
Zeros (%): < 0.1%
Memory size: 135.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0.2121
Q1: 0.3333
Median: 0.4848
Q3: 0.6212
95-th percentile: 0.7424
Maximum: 1
Range: 1
Interquartile range (IQR): 0.2879

Descriptive statistics

Standard deviation: 0.1718502156
Coefficient of variation (CV): 0.3612005228
Kurtosis: -0.8454118948
Mean: 0.4757751021
Mean Absolute Deviation (MAD): 0.1453236352
Skewness: -0.09042885856
Sum: 8268.4955
Variance: 0.02953249661
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[0. 0.0682 0.11365 0.17425 0.20455 ... 0.79545 0.82575 0.85605 0.9015 1. ], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
0.6212 988 5.7%
 
0.5152 618 3.6%
 
0.4091 614 3.5%
 
0.3333 600 3.5%
 
0.6667 593 3.4%
 
0.6061 588 3.4%
 
0.5303 579 3.3%
 
0.5 575 3.3%
 
0.4545 559 3.2%
 
0.303 549 3.2%
 
Other values (55) 11116 64.0%
 
ValueCountFrequency (%) 
0 2 < 0.1%
 
0.0152 4 < 0.1%
 
0.0303 8 < 0.1%
 
0.0455 9 0.1%
 
0.0606 14 0.1%
 
ValueCountFrequency (%) 
1 1 < 0.1%
 
0.9848 2 < 0.1%
 
0.9545 1 < 0.1%
 
0.9242 5 < 0.1%
 
0.9091 5 < 0.1%
 

hum
Real number (ℝ≥0)

Distinct count: 89
Unique (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.6272288394
Minimum: 0
Maximum: 1
Zeros: 22
Zeros (%): 0.1%
Memory size: 135.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0.31
Q1: 0.48
Median: 0.63
Q3: 0.78
95-th percentile: 0.93
Maximum: 1
Range: 1
Interquartile range (IQR): 0.3

Descriptive statistics

Standard deviation: 0.1929298341
Coefficient of variation (CV): 0.3075908216
Kurtosis: -0.8261167359
Mean: 0.6272288394
Mean Absolute Deviation (MAD): 0.163311399
Skewness: -0.1112871494
Sum: 10900.61
Variance: 0.03722192087
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[0. 0.04 0.145 0.185 0.225 ... 0.895 0.925 0.95 0.985 1. ], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
0.88 657 3.8%
 
0.83 630 3.6%
 
0.94 560 3.2%
 
0.87 488 2.8%
 
0.7 430 2.5%
 
0.66 388 2.2%
 
0.65 387 2.2%
 
0.69 359 2.1%
 
0.55 352 2.0%
 
0.74 341 2.0%
 
Other values (79) 12787 73.6%
 
ValueCountFrequency (%) 
0 22 0.1%
 
0.08 1 < 0.1%
 
0.1 1 < 0.1%
 
0.12 1 < 0.1%
 
0.13 1 < 0.1%
 
ValueCountFrequency (%) 
1 270 1.6%
 
0.97 1 < 0.1%
 
0.96 3 < 0.1%
 
0.94 560 3.2%
 
0.93 331 1.9%
 

windspeed
Real number (ℝ≥0)

ZEROS
Distinct count: 30
Unique (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.1900976063
Minimum: 0
Maximum: 0.8507
Zeros: 2180
Zeros (%): 12.5%
Memory size: 135.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0.1045
Median: 0.194
Q3: 0.2537
95-th percentile: 0.4179
Maximum: 0.8507
Range: 0.8507
Interquartile range (IQR): 0.1492

Descriptive statistics

Standard deviation: 0.1223402286
Coefficient of variation (CV): 0.6435653291
Kurtosis: 0.5908204107
Mean: 0.1900976063
Mean Absolute Deviation (MAD): 0.09631231746
Skewness: 0.5749052035
Sum: 3303.7063
Variance: 0.01496713153
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[0. 0.0448 0.09705 0.1194 0.20895 ... 0.4776 0.5373 0.597 0.67165 0.8507 ], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
0 2180 12.5%
 
0.1343 1738 10.0%
 
0.1642 1695 9.8%
 
0.194 1657 9.5%
 
0.1045 1617 9.3%
 
0.2239 1513 8.7%
 
0.0896 1425 8.2%
 
0.2537 1295 7.5%
 
0.2836 1048 6.0%
 
0.2985 808 4.6%
 
Other values (20) 2403 13.8%
 
ValueCountFrequency (%) 
0 2180 12.5%
 
0.0896 1425 8.2%
 
0.1045 1617 9.3%
 
0.1343 1738 10.0%
 
0.1642 1695 9.8%
 
ValueCountFrequency (%) 
0.8507 2 < 0.1%
 
0.8358 1 < 0.1%
 
0.806 2 < 0.1%
 
0.7761 1 < 0.1%
 
0.7463 2 < 0.1%
 

casual
Real number (ℝ≥0)

ZEROS
Distinct count: 322
Unique (%): 1.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 35.67621842
Minimum: 0
Maximum: 367
Zeros: 1581
Zeros (%): 9.1%
Memory size: 135.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 4
Median: 17
Q3: 48
95-th percentile: 138.1
Maximum: 367
Range: 367
Interquartile range (IQR): 44

Descriptive statistics

Standard deviation: 49.30503039
Coefficient of variation (CV): 1.382013918
Kurtosis: 7.571001747
Mean: 35.67621842
Mean Absolute Deviation (MAD): 34.13996034
Skewness: 2.499236891
Sum: 620017
Variance: 2430.986021
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[ 0. 0.5 1.5 3.5 5.5 ... 187.5 240.5 275.5 311.5 367. ], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
0 1581 9.1%
 
1 1082 6.2%
 
2 798 4.6%
 
3 697 4.0%
 
4 561 3.2%
 
5 509 2.9%
 
6 448 2.6%
 
7 405 2.3%
 
8 377 2.2%
 
9 348 2.0%
 
Other values (312) 10573 60.8%
 
ValueCountFrequency (%) 
0 1581 9.1%
 
1 1082 6.2%
 
2 798 4.6%
 
3 697 4.0%
 
4 561 3.2%
 
ValueCountFrequency (%) 
367 1 < 0.1%
 
362 1 < 0.1%
 
361 1 < 0.1%
 
357 1 < 0.1%
 
356 1 < 0.1%
 

registered
Real number (ℝ≥0)

HIGH CORRELATION
Distinct count: 776
Unique (%): 4.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 153.7868692
Minimum: 0
Maximum: 886
Zeros: 24
Zeros (%): 0.1%
Memory size: 135.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 4
Q1: 34
Median: 115
Q3: 220
95-th percentile: 465
Maximum: 886
Range: 886
Interquartile range (IQR): 186

Descriptive statistics

Standard deviation: 151.3572859
Coefficient of variation (CV): 0.9842016207
Kurtosis: 2.750017757
Mean: 153.7868692
Mean Absolute Deviation (MAD): 114.3961551
Skewness: 1.557904226
Sum: 2672662
Variance: 22909.028
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[0.000e+00 5.000e-01 2.500e+00 6.500e+00 9.500e+00 ... 4.925e+02 5.515e+02 7.695e+02 8.135e+02 8.860e+02], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
4 307 1.8%
 
3 294 1.7%
 
5 287 1.7%
 
6 266 1.5%
 
2 245 1.4%
 
1 201 1.2%
 
7 200 1.2%
 
8 190 1.1%
 
9 178 1.0%
 
11 140 0.8%
 
Other values (766) 15071 86.7%
 
ValueCountFrequency (%) 
0 24 0.1%
 
1 201 1.2%
 
2 245 1.4%
 
3 294 1.7%
 
4 307 1.8%
 
ValueCountFrequency (%) 
886 1 < 0.1%
 
885 1 < 0.1%
 
876 2 < 0.1%
 
871 1 < 0.1%
 
860 1 < 0.1%
 

cnt
Real number (ℝ≥0)

HIGH CORRELATION
Distinct count: 869
Unique (%): 5.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 189.4630876
Minimum: 1
Maximum: 977
Zeros: 0
Zeros (%): 0.0%
Memory size: 135.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 5
Q1: 40
Median: 142
Q3: 281
95-th percentile: 563.1
Maximum: 977
Range: 976
Interquartile range (IQR): 241

Descriptive statistics

Standard deviation: 181.3875991
Coefficient of variation (CV): 0.9573769823
Kurtosis: 1.417203281
Mean: 189.4630876
Mean Absolute Deviation (MAD): 142.3998489
Skewness: 1.277411604
Sum: 3292679
Variance: 32901.4611
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[ 1. 1.5 7.5 11.5 17.5 ... 596.5 693.5 750.5 900.5 977. ], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
5 260 1.5%
 
6 236 1.4%
 
4 231 1.3%
 
3 224 1.3%
 
2 208 1.2%
 
7 198 1.1%
 
8 182 1.0%
 
1 158 0.9%
 
10 155 0.9%
 
11 147 0.8%
 
Other values (859) 15380 88.5%
 
ValueCountFrequency (%) 
1 158 0.9%
 
2 208 1.2%
 
3 224 1.3%
 
4 231 1.3%
 
5 260 1.5%
 
ValueCountFrequency (%) 
977 1 < 0.1%
 
976 1 < 0.1%
 
970 1 < 0.1%
 
968 1 < 0.1%
 
967 1 < 0.1%
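
The high correlation between cnt and registered has a simple arithmetic explanation if cnt is the sum of casual and registered. The report does not state this definition, so the sketch below treats it as a hypothesis and checks it against the reported sums and a few (casual, registered, cnt) triples read from the first sample rows.

```python
# Sums copied from the descriptive statistics of casual, registered and cnt.
sum_casual = 620017
sum_registered = 2672662
sum_cnt = 3292679

# (casual, registered, cnt) triples read from the first sample rows of this report.
first_rows = [(3, 13, 16), (8, 32, 40), (5, 27, 32), (3, 10, 13), (0, 1, 1)]

totals_match = sum_casual + sum_registered == sum_cnt
rows_match = all(c + r == total for c, r, total in first_rows)
registered_share = sum_registered / sum_cnt  # fraction of the total from registered users

print(totals_match, rows_match, round(registered_share, 3))
```

If the hypothesis holds, registered is the dominant component of cnt, which fits the warning that the two are highly correlated.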
 

season_int
Categorical

HIGH CORRELATION
Distinct count: 4
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 135.9 KiB
3
4496
2
4409
1
4242
4
4232
ValueCountFrequency (%) 
3 4496 25.9%
 
2 4409 25.4%
 
1 4242 24.4%
 
4 4232 24.4%
 

Length

Max length: 1
Mean length: 1
Min length: 1
ValueCountFrequency (%) 
Decimal_Number 4 100.0%
 
ValueCountFrequency (%) 
Common 4 100.0%
 
ValueCountFrequency (%) 
ASCII 4 100.0%
 

month_int
Real number (ℝ≥0)

Distinct count: 12
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.537775476
Minimum: 1
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Memory size: 135.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 4
Median: 7
Q3: 10
95-th percentile: 12
Maximum: 12
Range: 11
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 3.438775714
Coefficient of variation (CV): 0.5259855935
Kurtosis: -1.201878197
Mean: 6.537775476
Mean Absolute Deviation (MAD): 2.982815041
Skewness: -0.009253248383
Sum: 113620
Variance: 11.82517841
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[ 1. 1.5 2.5 11.5 12. ], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
7 1488 8.6%
 
5 1488 8.6%
 
12 1483 8.5%
 
8 1475 8.5%
 
3 1473 8.5%
 
10 1451 8.3%
 
6 1440 8.3%
 
11 1437 8.3%
 
9 1437 8.3%
 
4 1437 8.3%
 
Other values (2) 2770 15.9%
 
ValueCountFrequency (%) 
1 1429 8.2%
 
2 1341 7.7%
 
3 1473 8.5%
 
4 1437 8.3%
 
5 1488 8.6%
 
ValueCountFrequency (%) 
12 1483 8.5%
 
11 1437 8.3%
 
10 1451 8.3%
 
9 1437 8.3%
 
8 1475 8.5%
 

weekday_int
Real number (ℝ≥0)

ZEROS
Distinct count: 7
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.003682605
Minimum: 0
Maximum: 6
Zeros: 2502
Zeros (%): 14.4%
Memory size: 135.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
Median: 3
Q3: 5
95-th percentile: 6
Maximum: 6
Range: 6
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 2.005771456
Coefficient of variation (CV): 0.6677707733
Kurtosis: -1.255996891
Mean: 3.003682605
Mean Absolute Deviation (MAD): 1.720868973
Skewness: -0.002998221376
Sum: 52201
Variance: 4.023119134
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[0. 0.5 5.5 6. ], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
6 2512 14.5%
 
0 2502 14.4%
 
5 2487 14.3%
 
1 2479 14.3%
 
3 2475 14.2%
 
4 2471 14.2%
 
2 2453 14.1%
 
ValueCountFrequency (%) 
0 2502 14.4%
 
1 2479 14.3%
 
2 2453 14.1%
 
3 2475 14.2%
 
4 2471 14.2%
 
ValueCountFrequency (%) 
6 2512 14.5%
 
5 2487 14.3%
 
4 2471 14.2%
 
3 2475 14.2%
 
2 2453 14.1%
 

weathersit_int
Categorical

HIGH CORRELATION
Distinct count: 4
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 135.9 KiB
1
11413
2
4544
3
 
1419
4
 
3
ValueCountFrequency (%) 
1 11413 65.7%
 
2 4544 26.1%
 
3 1419 8.2%
 
4 3 < 0.1%
 

Length

Max length: 1
Mean length: 1
Min length: 1
ValueCountFrequency (%) 
Decimal_Number 4 100.0%
 
ValueCountFrequency (%) 
Common 4 100.0%
 
ValueCountFrequency (%) 
ASCII 4 100.0%
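
The High Correlation warnings for season/season_int and weathersit/weathersit_int are what one would expect if the integer columns are encodings of the labels. The sketch below infers a label-to-code mapping by pairing values that have identical counts in the frequency tables. Matching by count is an inference, not proof, and the helper `match_by_count` is hypothetical.

```python
# Value counts copied from the frequency tables of each variable pair.
season = {"Summer": 4496, "Spring": 4409, "Winter": 4242, "Fall": 4232}
season_int = {3: 4496, 2: 4409, 1: 4242, 4: 4232}
weathersit = {"Clear": 11413, "Misty Cloudy": 4544, "Light Snow": 1419, "Thunderstorm": 3}
weathersit_int = {1: 11413, 2: 4544, 3: 1419, 4: 3}


def match_by_count(labels, codes):
    """Pair each label with the code that has the same count (counts must be unique)."""
    by_count = {count: code for code, count in codes.items()}
    assert len(by_count) == len(codes), "counts are not unique"
    return {label: by_count[count] for label, count in labels.items()}


season_map = match_by_count(season, season_int)
weathersit_map = match_by_count(weathersit, weathersit_int)
print(season_map)
print(weathersit_map)
```

The sample rows agree with the inferred pairs: Winter appears with season_int 1, Clear with weathersit_int 1, and Misty Cloudy with weathersit_int 2.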
 

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for an exactly linear relationship the steepness of the line does not affect the magnitude of r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
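
The calculation described above can be sketched in plain Python. The data are the temp and atemp values from the ten "First rows" of this report, so the result describes only that small sample, not the full dataset. The function name `pearson_r` is illustrative.

```python
import math


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # The common 1 / (n - 1) factors cancel, so raw sums are enough.
    return cov / math.sqrt(var_x * var_y)


# temp and atemp from the ten "First rows" of this report.
temp = [0.24, 0.22, 0.22, 0.24, 0.24, 0.24, 0.22, 0.20, 0.24, 0.32]
atemp = [0.2879, 0.2727, 0.2727, 0.2879, 0.2879, 0.2576, 0.2727, 0.2576, 0.2879, 0.3485]

r = pearson_r(temp, atemp)
# Shifting and rescaling one variable by a positive factor leaves r unchanged.
r_rescaled = pearson_r([10 * t + 5 for t in temp], atemp)
print(round(r, 3), round(r_rescaled, 3))
```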

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
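
As a sketch of this definition, the code below ranks both variables and then applies the Pearson formula to the ranks. Tied values receive the average of their positions, which is a common convention and an assumption here. The toy data are illustrative, not taken from the dataset.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of items tied with order[i].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    """Covariance divided by the product of standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(xs, ys):
    """Pearson correlation of the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


xs = [1, 2, 3, 4, 5, 6]
cubes = [x ** 3 for x in xs]  # monotonic but nonlinear in xs
rho = spearman_rho(xs, cubes)
r = pearson_r(xs, cubes)
print(rho, r)
```

For this monotonic but nonlinear relationship, ρ reaches its maximum while r stays below 1, which illustrates the difference described above.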

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
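
The pair-counting definition above corresponds to the variant usually called τ-a, sketched below with toy data. Tied pairs count as neither concordant nor discordant. Libraries often use a tie-adjusted variant instead, and which one produced this report's values is not stated here.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total pairs; ties count as neither."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    n = len(xs)
    return (concordant - discordant) / (n * (n - 1) / 2)


# Four observations give six pairs; only the middle two are out of order.
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
print(tau)
```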

Missing values

Sample

First rows

index  instant  dteday      season  yr  month    hr  holiday  weekday   workingday  weathersit    temp  atemp   hum   windspeed  casual  registered  cnt  season_int  month_int  weekday_int  weathersit_int
0      1        2011-01-01  Winter  0   January  0   0        Saturday  0           Clear         0.24  0.2879  0.81  0.0000     3       13          16   1           1          6            1
1      2        2011-01-01  Winter  0   January  1   0        Saturday  0           Clear         0.22  0.2727  0.80  0.0000     8       32          40   1           1          6            1
2      3        2011-01-01  Winter  0   January  2   0        Saturday  0           Clear         0.22  0.2727  0.80  0.0000     5       27          32   1           1          6            1
3      4        2011-01-01  Winter  0   January  3   0        Saturday  0           Clear         0.24  0.2879  0.75  0.0000     3       10          13   1           1          6            1
4      5        2011-01-01  Winter  0   January  4   0        Saturday  0           Clear         0.24  0.2879  0.75  0.0000     0       1           1    1           1          6            1
5      6        2011-01-01  Winter  0   January  5   0        Saturday  0           Misty Cloudy  0.24  0.2576  0.75  0.0896     0       1           1    1           1          6            2
6      7        2011-01-01  Winter  0   January  6   0        Saturday  0           Clear         0.22  0.2727  0.80  0.0000     2       0           2    1           1          6            1
7      8        2011-01-01  Winter  0   January  7   0        Saturday  0           Clear         0.20  0.2576  0.86  0.0000     1       2           3    1           1          6            1
8      9        2011-01-01  Winter  0   January  8   0        Saturday  0           Clear         0.24  0.2879  0.75  0.0000     1       7           8    1           1          6            1
9      10       2011-01-01  Winter  0   January  9   0        Saturday  0           Clear         0.32  0.3485  0.76  0.0000     8       6           14   1           1          6            1

Last rows

index  instant  dteday      season  yr  month     hr  holiday  weekday  workingday  weathersit    temp  atemp   hum   windspeed  casual  registered  cnt  season_int  month_int  weekday_int  weathersit_int
17369  17370    2012-12-31  Winter  1   December  14  0        Monday   1           Misty Cloudy  0.28  0.2727  0.45  0.2239     62      185         247  1           12         1            2
17370  17371    2012-12-31  Winter  1   December  15  0        Monday   1           Misty Cloudy  0.28  0.2879  0.45  0.1343     69      246         315  1           12         1            2
17371  17372    2012-12-31  Winter  1   December  16  0        Monday   1           Misty Cloudy  0.26  0.2576  0.48  0.1940     30      184         214  1           12         1            2
17372  17373    2012-12-31  Winter  1   December  17  0        Monday   1           Misty Cloudy  0.26  0.2879  0.48  0.0896     14      150         164  1           12         1            2
17373  17374    2012-12-31  Winter  1   December  18  0        Monday   1           Misty Cloudy  0.26  0.2727  0.48  0.1343     10      112         122  1           12         1            2
17374  17375    2012-12-31  Winter  1   December  19  0        Monday   1           Misty Cloudy  0.26  0.2576  0.60  0.1642     11      108         119  1           12         1            2
17375  17376    2012-12-31  Winter  1   December  20  0        Monday   1           Misty Cloudy  0.26  0.2576  0.60  0.1642     8       81          89   1           12         1            2
17376  17377    2012-12-31  Winter  1   December  21  0        Monday   1           Clear         0.26  0.2576  0.60  0.1642     7       83          90   1           12         1            1
17377  17378    2012-12-31  Winter  1   December  22  0        Monday   1           Clear         0.26  0.2727  0.56  0.1343     13      48          61   1           12         1            1
17378  17379    2012-12-31  Winter  1   December  23  0        Monday   1           Clear         0.26  0.2727  0.65  0.1343     12      37          49   1           12         1            1